Molecular Dynamics Simulation of Folding and Diffusion of Proteins in Nanopores 
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A novel combination of discontinuous molecular dynamics and the Langevin equation, together 
with an intermediate-resolution model, are used to carry out long (several fis) simulation and study 
folding transition and transport of proteins in slit nanopores. Both attractive {U'^) and repul- 
sive {U~) interaction potentials between the proteins and the pore walls are considered. Near the 
folding temperature Tf and in the presence of the proteins undergo a repeating sequence of 
folding/partially-folding/ unfolding transitions, while Tf decreases with decreasing pore sizes. The 
opposite is true when f/~ is present. The proteins' effective diffusivity D is computed as a function 
of their length (number of the amino acid groups), temperature T, the pore size, and the interaction 
potentials [/^. Far from Tf, D increases (roughly) linearly with T, but due to the thermal fluctua- 
tions and their effect on the proteins' structure near Tf, the dependence of 7? on T in this region is 
nonlinear. Under certain conditions, transport of proteins in smaller pores can be faster than that 
in larger pores. 

PACS: 87.15.Aa, 83.10.Mj, 87.15.Cc, 87.15.Vv, 87.83.+a 



Proteins' importance to biological systems cannot be 
overstated [1]: as enzymes they catalyze and regulate 
cells' activities; tissues are made of proteins, while as 
antibodies proteins are a vital part of the immune sys- 
tem. Proteins with globular structure fold into compact 
and biologically active configurations, and an important 
problem is understanding the mechanisms by which they 
attain their folded structure, and factors that contribute 
to the folding [2-4]. Such understanding is important 
due to debilitating illnesses, such as Alzheimer's and 
Parkinson's diseases, that are believed to be the result of 
accumulation of toxic protein aggregates [5-8] , as well as 
to the industrial production of enzymes and therapeutic 
proteins based on the DNA recombinant method [9]. 

While the three-dimensional (3D) structure of native 
proteins is controlled by their amino acid sequence [2-4] , 
their transport properties and the kinetics of their folding 
depend on the local environment. But, whereas protein 
folding in dilute solutions under bulk condition, typically 
used in in vitro studies, is relatively well-understood, the 
more important problem of protein folding in a confined 
medium is not. The environment inside a cell in which 
proteins fold is crowded, with the volume fraction of the 
crowding agents (such as RNA) may be 0.2-0.3. Thus, 
even in the absence of interactions between proteins and 
other cellular molecules, their movement inside the cell is 
limited. The limitation affects proteins' stability. Exper- 
iments indicated [10] that confinement often stabilizes 
the proteins' native structure [11], denatures them in 
the limited space of the cage model, first suggested by 
Anfinsen [2-4], and accelerates folding relative to that 
in bulk solutions. Studies of proteins of different native 
architectures in cylindrical nanporcs indicated [12] that. 



in vivo folding is not always spontaneous; rather, a subset 
of proteins may require molecular chaperones. 

Protein (enzyme) immobilization using porous solid 
support, via adsorption, encapsulation, and covalcnt link- 
ing, has been used for a long time [13, 14]. Such practi- 
cal applications as biocatalysis [15] and biosensors also 
entail not only better understanding of the folding in 
confined media, but also transport of proteins in such 
media. At the same time, protein purification using 
nanoporous membranes is also gaining attention [16]. 
SiC nanoporous membranes [17] allow [18] diffusion of 
proteins up to 29000 Daltons. but exclude larger ones. 
Despite the fundamental and practical significance of 
transport of protein in confined media, there is currently 
little understanding of the phenomenon. 

The goal of this paper is twofold. First, we use molec- 
ular dynamics (MD) simulation to study protein folding 
and stability in slit nanopores. Second, we utilize a novel 
combination of MD simulation and the Langevin equa- 
tion (LE) to study protein transport in the nanopores. 
To our knowledge, our combination of the MD simula- 
tion and the LE has never been proposed before, nor 
has there been any simulation of transport of proteins 
in nanopores. For such important practical applications 
as membrane purification, biocatalysis, and sensors, the 
transport of proteins in nanopores is of utmost impor- 
tance. A slit nanopore is a reasonable model for the type 
of pores that one encounters in such applications [15-18] 
and. despite its simplicity, it might also be a reasonable 
model for the pores in biological membranes. 

Some Monte Carlo [19] and MD [5, 20] simulations of 
proteins' behavior in nanopores were reported before. In 
particular, Lu et al. [5] and Cheung et al. [20] studied 



folding of proteins in spherical pores of different radii. 
Cheung et al. studied the phenomenon as a function 
of the volume fraction of a crowding agent, which they 
modeled by a bed of hard spheres with repulsive interac- 
tion with the proteins. While a spherical pore may be a 
suitable model for the cavity of GroEl-GroES complex, 
it is not so for the pores of membranes, biocatalysts, and 
sensors that are of prime interest to us. Instead, the slit 
(and cylindrical) pores are more appropropriate. More- 
over, for the types of applications that we consider, the 
pore space consists of interconnected channels, which is 
completely different from what Refs. [5,20] considered. 

In addition, the protein model that we use (see be- 
low) is, in our opinion, much more realistic than what 
the previous investigations [5, 20] utilized. For example, 
they used a simplified model for the amino acids that 
was based on two united atom (UA) beads. Moreover, 
the side chains of the amino- acid residues were not explic- 
itly considered. The model that we utilize represents the 
amino acids using four UA beads (see below), while the 
side chains are also considered explicitly, hence honoring 
the proteins' structure much more realistically. 

We simulate de novo-designed a family of proteins 
[21], which consists of only 4 types of amino acids in 
their 16-residue sequence, simplified further [22] to a se- 
quence of hydrophobic (H) and polar (P) residues. Using 
periodicity in the H-P sequence of the 16-residue peptide 
aiB, we made 3 other sequences with lengths ^ = 9, 23 
and 30 residues. As the four proteins have similar native 
structures, the differences in their behavior is attributed 
to their lengths. The simulations indicated that they all 
fold into an a-helix. 

The proteins are modeled by an intermediate- 
resolution model [23-26] , with several changes described 
below. Every amino acid is represented by four UA 
groups or beads. A nitrogen UA represents the amide 
N and hydrogen of an amino acid, a Cq, UA represents 
the a-C and its H, and a C UA for the carbonyl C and 
O. The fourth bead R represents the side chains, all of 
which are assumed to have the same diameter as CH3. 
All the backbone bond lengths and bond angles are fixed 
at their ideal values, and the distance between consecu- 
tive Cq, UA is fixed according to experimental data. 

To carry out long and efficient simulations, we use dis- 
continuous MD (DMD) [27]. This allowed us to carry 
out 5 IIS MD simulations, one order of magnitude longer 
than the previous simulations. The forces acting on the 
beads are the excludcd-volume effect, and attraction be- 
tween bonded and pseudobonded beads, between pairs of 
backbone beads during HB formation, and between hy- 
drophobic side chains. Nearest-neighbor beads along the 
chain backbone are covalently bonded, as are the Cq and 
R UAs. Pseudobonds are between next-nearest neighbor 
beads along the backbone to keep the backbone angles 
fixed; between neighboring pairs of Cq beads to main- 
tain their distances close to the experimental data, and 



between side chains and backbone N and C UAs to hold 
the side-chain beads fixed relative to the backbone. All 
of this keep the interpeptidc group in the trans configu- 
ration, and all the residues as i— isomers, as required. 

The potential Uij between a pair ij of bonded beads, 
separated by a distance rij, is given by, Uij = 00, for, 
Tij < ^(1 — (5), and rij > 1{1 + 6), and, Uij ~ for 
1{1 — S) < rij < 1{1 + S). Here, I is the ideal bond length, 
and S = 0.02375 is the tolerance in the bond's length 
[23-26]. The hydrophobic (HP) interactions between the 
side chains and the H in the sequence, if there are at 
least 3 intervening residues between them, is given by, 
Uup = 00 ,-eHP, and for, < cthp, cthp < < 
1.5(Thp, and > 1.5crHp, respectively, where (Thp is the 
HP side-chains' diameter. 

The HB interaction may occur between the N and C 
beads with at least 3 intervening residues, but each bead 
may not contribute to more than one HB at any time, 
with the range of the interaction being about 4.2 A. 
The HBs are stable when the angles in N-H-0 and C- 
0-H, controlled by a repulsive interaction between each 
of the N and C beads with the neighboring beads of the 
other one, are almost 180°. Thus, if a HB is formed be- 
tween beads and Cj, a repulsive interaction between 
neighbor beads of N^, namely, C^-i and Cq^, with Gj 
is assumed, and similarly for the neighbor beads of Cj, 
namely, Nj_|-i and Cqj, with the bead. 

An N or C bead at one end of the protein has only 
one neighbor bead in its backbone, instead of 2. Hence, 
controlling the HB angles will be limited, causing the 
HBs with one of their terminal constituents to be less re- 
stricted and, thus, more stable than the other HBs. This 
may cause formation of non-a-helical HBs in a part of the 
protein between the N and C beads, and of semistable 
structures that influence the results. To address this 
problem, assume that the N-terminal bead, Ni, has a 
HB with Cj. For i = 1, bead Ci-i does not exist to 
have a repulsive interaction with Cj and help control the 
HB angles. So, we use Cqi. Not only can we consider 
the repulsion between this bead and Cj, but also we de- 
fine an upper limit for their distance so as to control 
the motion freedom of Ni and Cj that constitute the 
beads in the HB. The potential Uki of such interactions 
is given by, Uki =00, ehb, 0, and 00 for ru < ^{ak+cn), 
li<yk + m) < ru < di, di < ru < ^2, and, r^ > d2, re- 
spectively. 

Two H atoms have chemical bonds with the nitrogen 
in the proteins' N-terminal, and are free to rotate around 
the Ni-Cqi bond, while at the same time satisfying the 
constraints on the angles between the chemical bonds of 
Ni. Thus, if a HB is formed, one of the two H atoms 
lies in the plane of N, O and C, such that the angles 
in N-H-0 and C-O-H are as close to 180° as possible. 
Hence, we force the maximum distance between Cqi and 
Cj to be the same as the maximum distance d2 between 
Cq; and Cj in the usual HBs, and similarly when the C- 



terminal Ce has a HB with N^. This allows us to control 
the angles in a HB that contains Ni. The T— dependence 
(T is dimensionless) oi d2 (in A), obtained from separate 
MD simulations (the details will be given elsewhere), is, 
d2 ~ 5.53- 0.019/r for Ni-Caj, and da ^ 5.69-0.044/T 
for d-Cai- 

There is also hard-core repulsion between two un- 
bonded beads that have no HB and HP interactions. At 
the same time, interactions between a pair of beads, sep- 
arated along the chain by 3 or fewer bonds, are more 
accurately represented by those between the atoms them- 
selves, not the UAs. Thus, we developed a variant of the 
previous models [23-26] to account for such interactions: 
the beads are allowed to overlap by up to 25% of their 
bead diameters, while for those separated by 4 bead di- 
ameters the allowed overlap is 15% of their diameters. 

We use a slit nanopore, modeled as the space between 
two 2D structureless carbon walls in the xy plane be- 
tween z = ±/i/2, with periodic boundary conditions in 
the X and y directions. The interaction between the walls 
and the protein beads is, fJpvv = oo, — epwi 0, — epvv, 
and oo for, zx < -{h/2 - d^x), -{h/2 - d^x) < zx < 
-{h/2 - dsx ~ dix), -{h/2 - dsx - d^x) < zx < 
h/2 - dsx ~ dix, h/2 - dax - d^x < zx < h/2 - d^x, 
and Zx > h/2 ~ d^x, respectively, where zx is the dis- 
tance between the center of a bead X and the walls. For 
all the beads, epw — |eHB, so chosen to represent re- 
alistically the competition between protein folding and 
its beads' interaction with the walls. To estimate d^x 
and d^x, the energy and size parameters between the C 
atoms in the walls and various beads were calculated us- 
ing Lorentz-Berthelot mixing rules, acx = ^{^c + <^x), 
and ecx = y/^c^x, where X =N, Cq., C and R. Then, us- 
ing separate simulations (details will be given elsewhere), 
the interaction potential Ucx between different beads 
was estimated. The distances at which Vex and its sec- 
ond derivative were zero were taken as d^x and d^x +d4x ■ 
The results (in A) are, dsx = 2.85, 3.02, 3.14, and 3.31, 
and d4x = 0.96, 1.01, 0.98, and 1.12, for X =N, C^, C, 
and R, respectively. 

We now describe how the effective diffusivity D of the 
proteins is computed. To do so, one must take into ac- 
count the effect of the solvent on the motion of the pro- 
teins. In the previous works the solvent's effect on the 
proteins' motion was included only implicitly by the HP 
attraction between the side chains. While this might be 
appropriate for studying the folding, it is not so for com- 
puting D, since the solvent's viscosity t] strongly and 
directly affects D. To explicitly include the solvent ef- 
fect, we have developed the following model which, to 
our knowledge, is new. 

We first carry out DMD simulation for a time period 
At. Suppose that the speeds of the proteins' center of 
mass (CM) at the beginning and end of the period At 
are, respectively, Vf, and Vg. Since the solvent's viscosity 
affects the proteins' velocity in the pore, but the time 



scale over which this effect is important is much differ- 
ent from At, we apply, at the end of the time period 
At, the Langevin equation (LE) to the proteins' CM to 
correct their velocity due to the presence of the solvent's 
molecules. To do so, we represent a protein as a particle 
with a mass m and an effective radius equal to its radius 
of gyration Rg. Then, the force F on its CM is given by, 

F = TO(Ve - Vb)/Ai . (1) 

The discretized LE is given by, 

Av = v„ - Vf, = FAt/m - ^VbAt/m + AR(Ai) , (2) 

where, ^ = QirRgrj, AR is a Gaussian random force (with 
zero mean and variance 2kBTS^At / m?), and v,i is the 
speed after applying the LE (acting as the for the 
next LE application). Thus, 

dv = v„ - Ve = -CAtvfc/m + AR(At) , (3) 

which yields v„ , the velocity of the proteins corrected for 
the solvent effect. The DMD simulation is then continued 
for another time period At using v„ as the v^, the LE is 
applied again to correct the proteins' velocity at the end 
of the period, and so on. 

We now present the results of our simulations. Con- 
sider the case of attractive interaction potential C/+ be- 
tween the proteins and the pore's walls. A folded state 
attaches itself to the walls only through its end groups, 
while unfolded ones may completely attach themselves 
to the walls. Thus, with the decrease in the average 
potential energy of the unfolded states is larger than the 
corresponding decrease for the folded one. Hence, com- 
pared to the bulk, the unfolded states in the pore with 
a U'^ are more stable than the folded one, in qualitative 
agreement with Refs. [4,20] for spherical cavities. 

Figure 1 shows a sequence of events for a protein of 
size ^ = 16 in a pore of size /i = 1.75 nm at Tf ~ 0.13. 
The protein changes its state from completely folded to 
a partially folded to an unfolded one which is completely 
attached to the pore's wall (frame D). Due to the 
transitions occur easily and repeatedly, even after a long 
time. Note that, in moving from B to C, the set of de- 
formed a— helical HBs changes, hence indicating rapid 
dynamics of the HB formation and deformation near Tj. 

Also shown in Figure 1 are protein configurations 
(frames E and F with pore size /i = 1.5 nm) for £ = 9 at 
T = 0.08. In these pores, a protein of length £ = 9 does 
not attain its native state at low T. Instead, it has a U 
shape with its two sides attached to the walls; it has 4 
HBs, only one of which is a— helical (the native state has 
5 a— helical HBs), and more of its atoms are close to the 
walls than those in the folded state. Although the po- 
tential energy of such unfolded states is roughly the same 
as one in the folded one, entropic effects which favor the 
unfolded states are also important. Upon further cool- 
ing at T below the apparent folding temperature Tf , the 



protein becomes trapped in the U shape without enough 
kinetic energy to overcome the energy barrier to attain a 
folded state. Thus, such configurations do not represent 
truly folded states. 

We also find that T/ decreases with the pore size, which 
is due to the attractive interaction energy between the 
protein and the walls. The decrease in Tf is indicative 
of the more stable unfolded states (or less stable folded 
state). 

The opposite (namely, increasing Tj with decreasing 
pore size) is true if the interaction is purely repulsive, 
U~ . In that case, proteins are even more confined in 
the pores. Thus, the number of possible unfolded states 
and, hence, their total entropy, will be smaller [28]. But, 
due to its compact configuration, confinement affects the 
entropy of the folded state less strongly, implying that, 
in the presence of U~ the folded state is more stable than 
that in the bulk. 

Figure 2 presents the average interaction energy (Upw) 
of the proteins with the walls, computed by the weighted 
histogram analysis method [29] . Cooling the proteins at 
T > Tf increases \{Upw)\, as well as the average (ua) 
of the a— helical HBs (which is, however, very small). 
Near T^ {ria) is nonnegligible, and the proteins can only 
laterally attach themselves to the walls, hence decreasing 
|(C/pw)|- By lowering T further, nearly the entire a- helix 
is formed, and [(J/pw)! increases again. Thus, Figure 
2 indicates that, not only does the interaction with a 
nanopore disturb folding, but also folding to a definite 
structure disturbs the proteins' interaction with the pore. 

Before describing the results for the effective diffusivi- 
ties, we should point out that, over the temperature range 
that we have simulated, the proteins resemble a prolate 
ellipsoid. Thus, they become elongated in very small 
nanopores, while they can "stand up" in larger pores, 
such that their ends touch the pores' walls, if they are 
long enough. Such configurations are also shown in Fig- 
ure 1. Thus, transport in small pores may actually be 
faster than in larger pores, a counterintuitive, but im- 
portant result. Such phenomena can complicate further 
delineation of the dependence of D on the various impor- 
tant parameters of the system. 

Our simulations indicate that transport of the proteins 
in the xy planes (parallel to the pore's walls) is Fickian, so 
that, after a sufficiently long time (which depends on the 
proteins' length, the pore size, and U^) the mean-square 
displacements (MSDs) of the proteins' CM vary linearly 
with the time, hence yielding an effective diffusivity D 
for the proteins. In contrast, the MSDs in the direction 
perpendicular to the pore's walls saturate at a finite time. 

Figure 3 presents the proteins' effective diffusivity D in 
the xy planes, in a pore with fixed h/ Rg = A.l. As T in- 
creases and Tj is approached, D also increases, since the 
proteins' Rg decreases. Far from Tf (where Rg changes 
little) D varies roughly linearly with T. However, near 
Tf, there are strong thermal fluctuations due to which Rg 



strongly varies with T. Consequently, D no longer varies 
linearly with T. Increasing the proteins' length decreases 
D, as expected. 

Figure 4 presents the diffusivities for a protein of fixed 
length, € = 16, in four different pores, and compares 
the results with those in the bulk. Note that, the Tf 
for the bulk state and those for the pores are not the 
same. Therefore, due to the strong fluctuations of Rg 
and D in the region around T/, the pore and bulk dif- 
fusivities cannot be directly compared. The comparison 
is possible mainly at temperatures far from Tf (both be- 
low and above Tf). As Figure 4 indicates, at least for 
the i^buik > ^porc (taking into account the numerical 
uncertainties. 

The results shown in Figures 3 and 4 were for an at- 
tractive potential [/+ between the proteins and the pores' 
walls. Figure 5 compares the diffusivities for a protein of 
length i = 16 in a pore of size h = 1.75 nm for both at- 
tractive and repulsive interaction potentionals [/^, and 
compares them with the bulk values. Generally speak- 
ing, for T < Tf the diffusivities with U~ are larger than 
those with If^ . 

In our current work, we are carrying out extensive sim- 
ulations for computing the effective diffusivity D as a 
function of the strength of the interaction potential C/pw 
between the proteins and the walls, its sign (attractive 
versus repulsive), and other relevant parameters. The 
results will be presented in a future paper. 
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FIG. 1. Various configurations (A-D) that a protein of lengtli i = IG tafces on in a pore at T/ ~ 0.13. Blue, gray, 
green, and pink spfieres sliow, respectively, the N bead, C^, C, and the side chains. In between the walls and the thin 
lines at a distance ds, Upw ~ oo, beyond which the acts for a distance ^4. Times t are in ps. Frames E and F show 
a short protein (lipid) with £ = 9 that has not reached its native state. Frames G and H show the configurations of a 
protein parallel and perpendicular to the walls. 
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FIG. 3. Dependence of the effective diffusivity D on the temperature T for various protein length £, in a pore with 
h/Rg =4.1. Arrows indicate the location of the folding temperature T/. 
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